On the Subdifferentiability of Functions of a Matrix Spectrum
I: Mathematical Foundations

James V. Burke and Michael L. Overton


Abstract 

We consider analytic matrix valued mappings $A : \mathbf{C} \to \mathbf{C}^{n \times n}$ and study the
variational properties of the spectrum of $A(\epsilon)$. Of particular interest are $\alpha(\epsilon)$ and
$\rho(\epsilon)$, respectively the spectral abscissa and the spectral radius of $A(\epsilon)$. In this paper,
we introduce the mathematical techniques required for this analysis. We begin with polynomials
and discuss the bifurcation of the roots of a polynomial having analytic coefficients. It is this
bifurcation phenomenon that leads to the nonlipschitzian behavior of the types of functions that
we wish to study. Puiseux-Newton series and diagrams are then introduced as a means for
analyzing these bifurcations. It is shown how these techniques can be used to describe the
tangent cone to certain sets of stable polynomials.

Matrices and polynomials are connected via characteristic polynomials. Further properties of
the spectrum of a matrix $A^{(0)}$ are obtained from a block diagonalization of $A^{(0)}$, where the
$k$th diagonal block is an $n_k$ by $n_k$ upper triangular matrix, with a constant diagonal whose
value is an eigenvalue of $A^{(0)}$ with multiplicity $n_k$. By using results of Arnold on the versal
deformation of a matrix, we show how the results concerning polynomials can be translated into
results about matrices. In the case that the block diagonal form is the Jordan form, the
conditions which are obtained reduce to conditions on generalized traces of a matrix.

1 Introduction

In this study we consider the variational properties of two related functions of the spectrum
of an analytic matrix valued function. Let $A$ be an analytic matrix valued mapping from
$\mathbf{C}$ to $\mathbf{C}^{n \times n}$; thus each element of $A(\epsilon)$ is an analytic (holomorphic) function
of a single complex parameter $\epsilon$. Define

$$\alpha(\epsilon) = \max\{\operatorname{Re}\lambda : \lambda \in \Sigma(\epsilon)\} \tag{1}$$

and

$$\rho(\epsilon) = \max\{|\lambda| : \lambda \in \Sigma(\epsilon)\}, \tag{2}$$

where $\Sigma(\epsilon)$ is the spectrum of $A(\epsilon)$, i.e. the roots of the characteristic polynomial

$$P(\epsilon, \lambda) = \det(\lambda I - A(\epsilon)) = 0. \tag{3}$$


The elements of $\Sigma(\epsilon)$ are called the eigenvalues of $A(\epsilon)$, the function $\alpha(\epsilon)$ is
called the spectral abscissa of $A(\epsilon)$, and $\rho(\epsilon)$ is called the spectral radius. The
spectral abscissa and radius are associated with stability properties of the matrix $A(\epsilon)$
and are important quantities in many applications. The spectral abscissa is relevant when
stability is defined in terms of exponentiation of a matrix, while the spectral radius is
relevant if stability is defined in terms of matrix powers. The former typically arises in
applications involving differential equations while the latter is relevant in the case of
difference equations (which themselves often arise as approximations to differential equations).

Both the functions $\alpha(\epsilon)$ and $\rho(\epsilon)$ are, in general, nonlipschitzian. The reason for
this is the well known phenomenon of bifurcation of roots that occurs when a polynomial with a
multiple root is subjected to analytic perturbation. This behavior was well understood by
Newton as early as 1680 and led to his development of so-called Puiseux-Newton series as a
means for describing these roots. (The name Puiseux is associated with these series because
it was he who, two hundred years later, proved their convergence [10].) The Puiseux-Newton
series is a power series in fractional powers of the perturbation parameter $\epsilon$, with the
smallest such power being greater than or equal to the inverse of the degree of the polynomial.
Thus, in general, $\alpha(\epsilon)$ and $\rho(\epsilon)$ can only be said to be Hölder continuous of order
$1/n$. A simple example is:

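$$A(\epsilon) = \begin{bmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & 0 & 1 \\ \epsilon & & & 0 \end{bmatrix}.$$

The eigenvalues are the roots of

$$\lambda^n - \epsilon = 0,$$

i.e. the quantity $\epsilon^{1/n}$ times the $n$th roots of unity. Taking $\epsilon$ to be real and
positive, we thus have $\alpha(\epsilon) = \rho(\epsilon) = \epsilon^{1/n}$. A more interesting example will be
given shortly.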
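The following is a minimal numerical sketch of this example, not part of the original text. It assumes that $A(\epsilon)$ is the $n \times n$ matrix with ones on the first superdiagonal and $\epsilon$ in the lower-left corner, a form consistent with the characteristic equation $\lambda^n - \epsilon = 0$ stated above. The helper names `example_matrix`, `spectral_abscissa`, and `spectral_radius` are hypothetical.

```python
import numpy as np

def example_matrix(n, eps):
    """n-by-n matrix with ones on the first superdiagonal and eps in the
    lower-left corner (assumed form; its characteristic polynomial is
    lambda**n - eps)."""
    A = np.diag(np.ones(n - 1), k=1).astype(complex)
    A[n - 1, 0] = eps
    return A

def spectral_abscissa(A):
    """Largest real part among the eigenvalues of A."""
    return np.max(np.linalg.eigvals(A).real)

def spectral_radius(A):
    """Largest modulus among the eigenvalues of A."""
    return np.max(np.abs(np.linalg.eigvals(A)))

n = 4
for eps in (1e-2, 1e-4, 1e-8):
    A = example_matrix(n, eps)
    # Both computed quantities should be close to eps**(1/n).
    print(eps, spectral_abscissa(A), spectral_radius(A), eps ** (1.0 / n))
```

For real positive $\epsilon$ both quantities track $\epsilon^{1/n}$, so the ratio $\alpha(\epsilon)/\epsilon$ grows without bound as $\epsilon \downarrow 0$, which is the nonlipschitzian behavior described above.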

Throughout Part I of this paper, the analytical dependence of the matrix $A$ is restricted
to a single complex variable $\epsilon$, which we will often take to be real and positive. In Part II
[4] we shall consider dependence of $A$ on several complex variables, and then the results of
Part I will be applied to obtain results along curves in the parameter space of Part II.


2 Roots of Polynomials

Matrix theoretic results can always be applied to obtain insight into the behavior of
polynomials by stating the results in terms of companion matrices. On the other hand, polynomials
are crude measures of the behavior of matrices since their structure is far less rich.
Nonetheless, polynomial results provide an important starting point for investigation into matrices.
Because the eigenvalues are the roots of the characteristic polynomial $P(\epsilon, \lambda)$, we begin
by addressing the conditions on the coefficients of the characteristic polynomial that are
imposed by requiring $\alpha(\epsilon)$ or $\rho(\epsilon)$ to have certain properties. A key point is to simplify
the class of polynomials that one needs to consider. From [2, pp. 376-381], there is an open disk
in the complex plane containing the origin on which the polynomial $P(\epsilon, \lambda)$ has the unique
representation

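$$P(\epsilon, \lambda) = \prod_{\lambda_k \in \Sigma(0)} \left( (\lambda - \lambda_k)^{n_k} + \beta_{k1}(\epsilon)(\lambda - \lambda_k)^{n_k - 1} + \cdots + \beta_{k n_k}(\epsilon) \right) \tag{4}$$

where the product index runs over each eigenvalue $\lambda_k$ (with multiplicity $n_k$) in $\Sigma(0)$, and
the $\beta_{kj}(\epsilon)$ are analytic functions vanishing at $\epsilon = 0$. For the moment we shall assume that
there is only one such eigenvalue $\lambda_0$, with multiplicity $n_0$, so that $P$ reduces to

$$P(\epsilon, \lambda) = (\lambda - \lambda_0)^{n_0} + \beta_1(\epsilon)(\lambda - \lambda_0)^{n_0 - 1} + \cdots + \beta_{n_0}(\epsilon), \tag{5}$$

with

$$\beta_j(\epsilon) = \beta_j^{(1)} \epsilon + \beta_j^{(2)} \epsilon^2 + \cdots, \qquad j = 1, \ldots, n_0.$$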

We now develop the key result needed to understand the variational behavior of the
spectral abscissa and radius. The result provides information about the first-order behavior
of the coefficients of $P(\epsilon, \lambda)$ under an assumption on the rate of change of its roots in a
certain direction in the complex plane.

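Lemma 1 Consider the polynomial equation (5). Let $y_0$ be any nonzero complex number
and suppose that all the roots $\lambda(\epsilon)$ of (5) satisfy

$$\operatorname{Re} y_0(\lambda(\epsilon) - \lambda_0) \leq \delta\epsilon + o(\epsilon) \tag{6}$$

for some real sequence $\{\epsilon_\nu\}$ with $\epsilon_\nu \downarrow 0$. Then

$$\operatorname{Re} y_0 \beta_1^{(1)} \geq -n_0 \delta, \tag{7}$$

$$\operatorname{Re} y_0^2 \beta_2^{(1)} \geq 0, \qquad \operatorname{Im} y_0^2 \beta_2^{(1)} = 0, \tag{8}$$

$$\beta_j^{(1)} = 0, \qquad j = 3, \ldots, n_0. \tag{9}$$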

An important special case is $y_0 = 1$, $\delta = 0$. In this case, Lemma 1 states the following: the
tangent cone of the set of stable polynomials, i.e. those with roots in the left half-plane, at
the point $(\lambda - \lambda_0)^{n_0}$ in the space of monic polynomials with analytic coefficients, is
contained in the set of polynomials (5) satisfying $\operatorname{Re} \beta_1^{(1)} \geq 0$,
$\operatorname{Re} \beta_2^{(1)} \geq 0$, $\operatorname{Im} \beta_2^{(1)} = 0$, and $\beta_j^{(1)} = 0$,
$j = 3, \ldots, n_0$. In fact, this set equals the tangent cone [3,8].

The proof of this lemma may be found in [5]. It uses the Puiseux-Newton diagram, a
technique devised by Newton [9] for computing the coefficients of Puiseux-Newton series.
Although we shall not repeat the proof of Lemma 1 here, we shall give some motivating
remarks by means of an example.

Suppose that $\lambda_0 = 0$ and

$$P(\epsilon, \lambda) = \lambda^3 + \epsilon\lambda^2 + (-\epsilon - \epsilon^2)\lambda + (\epsilon^2 + 2\epsilon^3). \tag{10}$$


Consider the diagram in Figure 1, in which (a) the power of $\epsilon$ in the leading nonzero term
of the $j$th coefficient is plotted against $j$ (this includes the point (0,0)), and (b) the lower
boundary of the convex hull of these points, namely a piecewise linear function, is drawn:


Figure  1 


Newton used an "Ansatz" argument to show that the slopes of this piecewise linear function
are precisely the powers of $\epsilon$ in the leading terms of the expansions for $\lambda(\epsilon)$, the roots
of $P(\epsilon, \lambda) = 0$. For this example, the roots have the form

$$\lambda(\epsilon) = \pm\epsilon^{1/2} + \cdots \tag{11}$$

and

$$\lambda(\epsilon) = \epsilon + \cdots, \tag{12}$$

as can be verified by substitution into (10) and observing cancellation. Note that there are
two roots (11) whose leading power of $\epsilon$ is 1/2, reflecting the fact that the line segment in
Figure 1 with slope 1/2 runs from $j = 0$ to $j = 2$. The coefficients of these $\epsilon^{1/2}$ terms are,
in this case, the two square roots of $-\beta_2^{(1)} = 1$; if $\beta_2(\epsilon)$ is changed to
$\epsilon - \epsilon^2$, we obtain leading terms $\pm i\epsilon^{1/2}$ in (11), where $i = \sqrt{-1}$. Similarly, there is
only one root (12) with leading power of $\epsilon$ equal to 1, reflecting the fact that the line segment
with slope 1 runs from $j = 2$ to $j = 3$, and the coefficient of this leading term is
$-\beta_3^{(2)} / \beta_2^{(1)} = 1$. (That these are the relevant coefficients follows from the fact
that the line segment with slope 1 interpolates the points (2,1) and (3,2).)
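The diagram construction described above can be sketched numerically. The code below is an illustration rather than the authors' procedure: it builds the lower convex hull of the points $(j, \text{leading power})$ with a standard monotone-chain scan and then checks the predicted leading powers against numerically computed roots. The function names are hypothetical, and the coefficient list assumes that the constant term of (10) reads $\epsilon^2 + 2\epsilon^3$; only its leading term $\epsilon^2$ affects the diagram.

```python
from fractions import Fraction
import numpy as np

def newton_diagram_slopes(points):
    """Lower convex hull of (j, leading power) points; returns a list of
    (slope, horizontal length) pairs, one per edge, left to right."""
    hull = []
    for p in sorted(points):
        # Pop the last hull point while it lies on or above the segment
        # joining hull[-2] to p (monotone-chain lower hull).
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return [(Fraction(y2 - y1, x2 - x1), x2 - x1)
            for (x1, y1), (x2, y2) in zip(hull, hull[1:])]

def cubic_roots(eps):
    """Roots of the example cubic, sorted by modulus. The 2*eps**3 term is
    an assumed reading of the constant coefficient."""
    coeffs = [1.0, eps, -eps - eps**2, eps**2 + 2 * eps**3]
    return sorted(np.roots(coeffs), key=abs)

# Points (j, leading power of eps in the j-th coefficient), including (0, 0).
points = [(0, 0), (1, 1), (2, 1), (3, 2)]
print(newton_diagram_slopes(points))

eps = 1e-4
small, mid1, mid2 = cubic_roots(eps)
# Ratios should be near 1 if the leading terms are eps and +/- eps**0.5.
print(small / eps, abs(mid1) / eps**0.5, abs(mid2) / eps**0.5)
```

The hull has one edge of slope 1/2 spanning two units of $j$ and one edge of slope 1 spanning one unit, matching the two roots of order $\epsilon^{1/2}$ and the single root of order $\epsilon$.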

With this example understood, the result of Lemma 1 can now be explained. Let $y_0$ have
any nonzero value, and let $L$ be the line in the complex plane defined by
$\{z : \operatorname{Re} y_0(z - \lambda_0) = 0\}$. In order for (6) to hold, all slopes in the associated
Puiseux-Newton diagram must be $\geq 1/2$ since, for example, a slope with value 1/3 corresponds to
three roots with leading term $\epsilon^{1/3}$ and coefficients equal to the three cube roots of some
complex number, which means that at least one of the roots must lie on one side of $L$ and at
least one of the roots must lie on the other side. This, then, explains (9). In the case of a
slope with value 1/2, the only possible way (6) can hold is if both roots lie on $L$; this amounts
to a condition on $\beta_2^{(1)}$ which is given by (8). Finally, (7) follows directly from (6) since
$-\beta_1(\epsilon)$ is the sum of the roots $\lambda(\epsilon)$ (shifted by $\lambda_0$).
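The last step can be written out as a short worked derivation. This is a sketch that assumes the first-order expansion $\beta_1(\epsilon) = \beta_1^{(1)}\epsilon + o(\epsilon)$ and uses $\lambda_1(\epsilon), \ldots, \lambda_{n_0}(\epsilon)$ as illustrative labels (not in the original) for the roots of (5), with $\epsilon$ restricted to the sequence $\epsilon_\nu$:

```latex
\begin{align*}
-\beta_1(\epsilon)
  &= \sum_{i=1}^{n_0} \bigl(\lambda_i(\epsilon) - \lambda_0\bigr)
  && \text{(sum of the shifted roots of (5))} \\
-\operatorname{Re}\, y_0 \beta_1(\epsilon)
  &= \sum_{i=1}^{n_0} \operatorname{Re}\, y_0 \bigl(\lambda_i(\epsilon) - \lambda_0\bigr)
   \le n_0 \delta \epsilon + o(\epsilon)
  && \text{(apply (6) to each root)} \\
-\operatorname{Re}\, y_0 \beta_1^{(1)}
  &\le n_0 \delta
  && \text{(divide by } \epsilon = \epsilon_\nu > 0 \text{ and let } \nu \to \infty\text{)}
\end{align*}
```

Rearranging the final line gives (7).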

The result of Lemma 1 extends immediately to general polynomials of the form (4).
However, we shall not need to state this explicitly; instead we go on to consider the original
matrix problem.

3 Eigenvalues of Matrices

A matrix can be reduced via similarity transformations to a variety of canonical forms.
A finite number of elementary unitary transformations is sufficient to reduce a matrix to
Hessenberg form, where all subdiagonals except the first are reduced to zero; this can be
further reduced to upper triangular or Schur form by a general unitary transformation. The
spectrum of a matrix appears on the diagonal of its Schur form, but other information, such
as invariant subspace information, is not apparent. In order to further reduce the matrix, i.e.
to introduce zeros in the upper triangle as well as the lower, nonunitary transformations are
generally required; such transformations are potentially quite ill-conditioned, i.e. the norm
of the transformation times the norm of its inverse could be large. The ultimate canonical
form is the Jordan form, where zeros are introduced everywhere except on the diagonal and
some parts of the first superdiagonal. However, the Jordan form is a discontinuous function
over the space of matrices. See [7] for an excellent general discussion.

The canonical form that we shall need is block diagonal form ([7, Section 7.1.3]). This
generally requires nonunitary transformations but is not as difficult to compute as the Jordan
form (which is a special case). Let $A(\epsilon)$ be the analytic matrix valued function of Section
1, and assume that

$$A(0) \equiv A^{(0)} = P D P^{-1}, \qquad D = \operatorname{Diag}(D_1, \ldots, D_m), \tag{13}$$

where the $k$th diagonal block $D_k$ is upper triangular with constant diagonal, i.e.

$$D_k = \lambda_k I + N_k,$$

with $N_k$ strictly upper triangular and hence nilpotent. Here $m$ is the number of distinct
eigenvalues of $A^{(0)}$, so $\lambda_k$ is an eigenvalue of $A(0)$ with (algebraic) multiplicity $n_k$, the
order of $N_k$. The geometric multiplicity of $\lambda_k$ is the nullity of $N_k$; this is easily seen to
be the number of independent eigenvectors associated with $\lambda_k$. If $N_k = 0$, the eigenvalue
$\lambda_k$ is said to be semisimple (or nondefective); if $N_k$ has rank $n_k - 1$, $\lambda_k$ is said to be
nonderogatory.

Now define $G(\epsilon) = P^{-1} A(\epsilon) P$, so that $G(0) = D$. A result of V.I. Arnold [1] states that
$G(\epsilon)$ has the following versal deformation:

$$G(\epsilon) = Y(\epsilon) H(\epsilon) Y(\epsilon)^{-1} \tag{14}$$

where $Y$ and $H$ are both analytic, $Y(0) = I$, and $H(\epsilon)$ commutes with $D$ for all $\epsilon$ in a
neighborhood of 0. Now since $D$ is block diagonal, $H(\epsilon)$ must also be block diagonal with
the same block sizes $n_1, \ldots, n_m$ (one way to prove this is that $D$ is similar, via a block
diagonal similarity transformation, to its Jordan form, and the matrices commuting with a
Jordan form have a special block diagonal structure [1,6]). Now $A(\epsilon)$, $G(\epsilon)$ and $H(\epsilon)$ are
all similar to each other, so the eigenvalues of $A(\epsilon)$ are given by the roots of the
characteristic polynomial of $H(\epsilon)$, which is the product of characteristic polynomials of the
diagonal blocks $H_k(\epsilon)$. The following lemma, brought to our attention by J. Sylvester [11],
shows how to compute the derivatives of the coefficients of these characteristic polynomials at
$\epsilon = 0$:

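Lemma 2

$$\left. \frac{d}{d\epsilon} \det(\lambda I - H_k(\epsilon)) \right|_{\epsilon = 0}
= -\operatorname{tr}(H_k^{(1)})(\lambda - \lambda_k)^{n_k - 1} - \operatorname{tr}(N_k H_k^{(1)})(\lambda - \lambda_k)^{n_k - 2} - \cdots
- \operatorname{tr}(N_k^{n_k - 2} H_k^{(1)})(\lambda - \lambda_k) - \operatorname{tr}(N_k^{n_k - 1} H_k^{(1)}),$$

where $H_k^{(1)} = H_k'(0)$.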

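Proof First note that $H_k(0)$ is the $k$th diagonal block of $G(0)$, i.e. $D_k$, so we may write
$H_k(\epsilon) = D_k + \epsilon H_k^{(1)} + o(\epsilon)$. Let $\mu = \lambda - \lambda_k$, so

$$\lambda I - H_k(\epsilon) = \mu I - N_k - \epsilon H_k^{(1)} + o(\epsilon).$$

Then

$$\left. \frac{d}{d\epsilon} \det(\lambda I - H_k(\epsilon)) \right|_{\epsilon = 0}
= \det(\mu I - N_k) \left. \frac{d}{d\epsilon} \det\!\left(I - \epsilon \mu^{-1} (I - \mu^{-1} N_k)^{-1} H_k^{(1)}\right) \right|_{\epsilon = 0}$$

$$= -\mu^{n_k} \operatorname{tr}\!\left(\mu^{-1} (I - \mu^{-1} N_k)^{-1} H_k^{(1)}\right)$$

$$= -\mu^{n_k - 1} \operatorname{tr}\!\left( \left(I + \mu^{-1} N_k + (\mu^{-1} N_k)^2 + \cdots + (\mu^{-1} N_k)^{n_k - 1}\right) H_k^{(1)} \right). \qquad \Box$$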
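Lemma 2 can be checked numerically on a single block. The sketch below is an illustration under stated assumptions, not part of the original argument: it takes a random strictly upper triangular `N` in place of $N_k$ and a random `H1` in place of $H_k^{(1)}$, works in the shifted variable $\mu = \lambda - \lambda_k$, and compares a central-difference derivative of the characteristic polynomial coefficients with the traces in the lemma. The coefficients are computed with the Faddeev-LeVerrier recursion, a technique swapped in here for convenience; `charpoly` and `lemma2_check` are hypothetical names.

```python
import numpy as np

def charpoly(M):
    """Coefficients [1, c_1, ..., c_n] of det(mu*I - M), highest power first,
    via the Faddeev-LeVerrier recursion."""
    n = M.shape[0]
    coeffs = [1.0]
    Mk = np.eye(n)
    for k in range(1, n + 1):
        AM = M @ Mk
        c = -np.trace(AM) / k
        coeffs.append(c)
        Mk = AM + c * np.eye(n)
    return np.array(coeffs)

def lemma2_check(N, H1, h=1e-5):
    """Central-difference derivative at eps = 0 of the coefficients of
    det(mu*I - N - eps*H1), alongside the predicted -tr(N^(j-1) H1)."""
    n = N.shape[0]
    fd = (charpoly(N + h * H1) - charpoly(N - h * H1)) / (2 * h)
    predicted = np.array([
        -np.trace(np.linalg.matrix_power(N, j - 1) @ H1)
        for j in range(1, n + 1)
    ])
    # fd[0] belongs to the monic leading term and is dropped.
    return fd[1:], predicted

rng = np.random.default_rng(0)
n = 4
N = np.triu(rng.standard_normal((n, n)), k=1)  # strictly upper triangular
H1 = rng.standard_normal((n, n))               # stands in for H_k^(1)
fd, predicted = lemma2_check(N, H1)
print(np.max(np.abs(fd - predicted)))          # expected to be small
```

Entry $j$ of each returned array corresponds to the coefficient of $\mu^{n-j}$, so agreement of the two arrays reproduces the lemma term by term for this test block.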

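We are now in a position to prove the main result of this section.

Lemma 3 Let $A(\epsilon) = A^{(0)} + \epsilon A^{(1)} + \cdots$ be an analytic matrix function with $A^{(0)}$
having the block diagonal decomposition (13), and let $B_1, \ldots, B_m$ be the corresponding
diagonal blocks of $P^{-1} A^{(1)} P$ (this matrix is not block diagonal in general). Then the
eigenvalues $\lambda(\epsilon)$ of $A(\epsilon)$ corresponding to the eigenvalue $\lambda_k$ of $A^{(0)}$ are the roots of
the polynomial

$$(\lambda - \lambda_k)^{n_k} + \beta_{k1}(\epsilon)(\lambda - \lambda_k)^{n_k - 1} + \cdots + \beta_{k n_k}(\epsilon) = 0$$

where $\beta_{kj}$ are analytic functions with

$$\beta_{kj}(0) = 0, \qquad \beta_{kj}^{(1)} \equiv \beta_{kj}'(0) = -\operatorname{tr}(N_k^{j-1} B_k), \qquad j = 1, \ldots, n_k.$$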

Proof Using the versal deformation (14), we have

$$P^{-1} A'(0) P = G'(0) = H'(0) + Y'(0) H(0) - H(0) Y'(0),$$

i.e.

$$P^{-1} A^{(1)} P = H^{(1)} + Y^{(1)} D - D Y^{(1)}.$$

Therefore

$$\operatorname{tr}(N_k^{j-1} B_k) = \operatorname{tr}(N_k^{j-1} H_k^{(1)}) + \operatorname{tr}\!\left(N_k^{j-1} (Y_k^{(1)} D_k - D_k Y_k^{(1)})\right),$$

where $Y_k^{(1)}$ is the $k$th diagonal block of $Y'(0)$. Now note that $N_k^{j-1}$ commutes with $D_k$, so
the last term is zero. (This follows since the trace of $E(ZD - DZ)$ is zero for all $Z$ if and only
if $E$ commutes with $D$; this easily verified fact is a basic tool in the derivation of the Arnold
versal deformation.) The proof is completed by the application of Lemma 2. $\Box$

Remark. Suppose that the block decomposition is the Jordan form, i.e. all superdiagonals
of $N_k$ are zero except the first, which consists of zeros and ones. Suppose further that all
eigenvalues are nonderogatory, i.e. no first superdiagonal contains a zero. Then the quantity
$\operatorname{tr} N_k^{j-1} B_k$ reduces to $\operatorname{tr}^{(j)} B_k$, the $j$th generalized trace of $B_k$, which is
defined to be the sum of the elements on the $(j-1)$th subdiagonal of $B_k$. In the derogatory case,
we obtain a sum of such generalized traces.
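The reduction to generalized traces in the nonderogatory Jordan case can be illustrated with a few lines of NumPy. This is a sketch with an arbitrary test matrix `B` standing in for $B_k$; the helper name `generalized_trace` is hypothetical.

```python
import numpy as np

def generalized_trace(B, j):
    """Sum of the entries on the (j-1)th subdiagonal of B; j = 1 gives the
    ordinary trace."""
    return np.trace(B, offset=-(j - 1))

n = 4
N = np.diag(np.ones(n - 1), k=1)             # one nilpotent Jordan block
B = np.arange(1.0, n * n + 1).reshape(n, n)  # arbitrary test block
for j in range(1, n + 1):
    lhs = np.trace(np.linalg.matrix_power(N, j - 1) @ B)
    # The two printed values should agree for every j.
    print(j, lhs, generalized_trace(B, j))
```

Because the only nonzero entries of `N**(j-1)` sit on the $(j-1)$th superdiagonal, the product picks out exactly the $(j-1)$th subdiagonal of `B` in the trace.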

4 The Spectral Abscissa

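We now obtain the following result:

Lemma 4 Let $A(\epsilon) = A^{(0)} + \epsilon A^{(1)} + \cdots$ be an analytic matrix function with $A^{(0)}$
having the block diagonal decomposition (13), and let $B_1, \ldots, B_m$ be the corresponding
diagonal blocks of $P^{-1} A^{(1)} P$. Recall the definition of the spectral abscissa $\alpha(\epsilon)$ in (1),
and define

$$\mathcal{A} = \{k : \operatorname{Re} \lambda_k = \alpha(0)\}.$$

Suppose that

$$\alpha(\epsilon) - \alpha(0) \leq \delta\epsilon + o(\epsilon) \tag{15}$$

for some real sequence $\{\epsilon_\nu\}$ with $\epsilon_\nu \downarrow 0$. Then, for each $k \in \mathcal{A}$,

$$\operatorname{Re} \operatorname{tr} B_k \leq n_k \delta,$$

$$\operatorname{Re} \operatorname{tr}(N_k B_k) \leq 0, \qquad \operatorname{Im} \operatorname{tr}(N_k B_k) = 0,$$

$$\operatorname{tr}(N_k^{j-1} B_k) = 0, \qquad j = 3, \ldots, n_k.$$

Proof Clearly, (15) is equivalent to requiring, for each $k \in \mathcal{A}$, that

$$\operatorname{Re}(\lambda(\epsilon) - \lambda_k) \leq \delta\epsilon + o(\epsilon)$$

for all $\lambda(\epsilon)$ which are eigenvalues of $A(\epsilon)$ corresponding to $\lambda_k$. The proof therefore
follows from applying Lemma 1 (with $y_0 = 1$) and Lemma 3. $\Box$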
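A small numerical illustration of Lemma 4 is sketched below. It is not taken from the paper: it assumes the simplest setting $P = I$, a single $2 \times 2$ nilpotent Jordan block as $A^{(0)}$ (so $\lambda_1 = 0$, $n_1 = 2$, $B_1 = A^{(1)}$), and two hypothetical perturbations `B_bad` and `B_ok` chosen so that $\operatorname{tr}(N_1 B_1)$ is $+1$ and $-1$ respectively.

```python
import numpy as np

def spectral_abscissa(A):
    """Largest real part among the eigenvalues of A."""
    return np.max(np.linalg.eigvals(A).real)

N = np.array([[0.0, 1.0], [0.0, 0.0]])      # A^(0): one nilpotent Jordan block
B_bad = np.array([[0.0, 0.0], [1.0, 0.0]])  # tr(N @ B_bad) = +1
B_ok = np.array([[0.0, 0.0], [-1.0, 0.0]])  # tr(N @ B_ok) = -1

def growth_ratio(B, eps):
    """(alpha(eps) - alpha(0)) / eps for A(eps) = N + eps * B."""
    return (spectral_abscissa(N + eps * B) - spectral_abscissa(N)) / eps

for eps in (1e-2, 1e-4, 1e-6):
    # First ratio grows like eps**(-1/2); second stays near zero.
    print(eps, growth_ratio(B_bad, eps), growth_ratio(B_ok, eps))
```

With `B_bad` the condition $\operatorname{Re} \operatorname{tr}(N_k B_k) \leq 0$ fails, the eigenvalues are $\pm\epsilon^{1/2}$, and no finite $\delta$ can satisfy (15). With `B_ok` the eigenvalues are $\pm i\epsilon^{1/2}$, the spectral abscissa stays at zero, and the necessary conditions of the lemma hold with $\delta = 0$.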


5 The Spectral Radius

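Similarly, we obtain

Lemma 5 Using the same assumptions as in the previous lemma, recall the definition of
the spectral radius $\rho(\epsilon)$ in (2), and define

$$\mathcal{R} = \{k : |\lambda_k| = \rho(0)\}.$$

Suppose that

$$\rho(\epsilon) - \rho(0) \leq \delta\epsilon + o(\epsilon) \tag{16}$$

for some real sequence $\{\epsilon_\nu\}$ with $\epsilon_\nu \downarrow 0$. Then, for each $k \in \mathcal{R}$,

$$\operatorname{Re}(\bar{\lambda}_k \operatorname{tr} B_k) + |\operatorname{tr} N_k B_k| \leq n_k \delta |\lambda_k|, \tag{17}$$

$$\operatorname{Re}(\bar{\lambda}_k^2 \operatorname{tr}(N_k B_k)) \leq 0, \qquad \operatorname{Im}(\bar{\lambda}_k^2 \operatorname{tr}(N_k B_k)) = 0, \tag{18}$$

$$\operatorname{tr}(N_k^{j-1} B_k) = 0, \qquad j = 3, \ldots, n_k.$$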

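Partial Proof Equation (16) is equivalent to requiring, for each $k \in \mathcal{R}$, that

$$|\lambda(\epsilon)| - |\lambda_k| \leq \delta\epsilon + o(\epsilon)$$

for all $\lambda(\epsilon)$ which are eigenvalues of $A(\epsilon)$ corresponding to $\lambda_k$, i.e.

$$|\lambda(\epsilon)|^2 - |\lambda_k|^2 \leq \delta\epsilon(|\lambda(\epsilon)| + |\lambda_k|) + o(\epsilon)$$

or equivalently

$$\operatorname{Re}(\bar{\lambda}_k(\lambda(\epsilon) - \lambda_k)) + \tfrac{1}{2}|\lambda(\epsilon) - \lambda_k|^2 \leq \tfrac{1}{2}\delta\epsilon(|\lambda(\epsilon)| + |\lambda_k|) + o(\epsilon).$$

Dropping a nonnegative term from the left-hand side, and using the Hölder continuity of
$\lambda(\epsilon)$, we have

$$\operatorname{Re}(\bar{\lambda}_k(\lambda(\epsilon) - \lambda_k)) \leq \delta\epsilon|\lambda_k| + o(\epsilon).$$

By applying Lemma 1, with $y_0 = \bar{\lambda}_k$, and Lemma 3, we obtain almost the result we need,
but without the second term on the left-hand side of (17). Obtaining (17) requires a more
careful argument, which is given in [5]. $\Box$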
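The passage from the squared-modulus inequality to the real-part inequality rests on an elementary identity, written out here as a sketch for the reader's convenience:

```latex
\begin{align*}
|\lambda(\epsilon)|^2 - |\lambda_k|^2
  &= \bigl|\lambda_k + (\lambda(\epsilon) - \lambda_k)\bigr|^2 - |\lambda_k|^2 \\
  &= 2\,\operatorname{Re}\bigl(\bar{\lambda}_k (\lambda(\epsilon) - \lambda_k)\bigr)
     + |\lambda(\epsilon) - \lambda_k|^2 .
\end{align*}
```

Dividing by 2 gives the left-hand side of the intermediate inequality. For the right-hand side, continuity of $\lambda(\epsilon)$ at $\epsilon = 0$ gives $|\lambda(\epsilon)| + |\lambda_k| = 2|\lambda_k| + o(1)$, so that $\tfrac{1}{2}\delta\epsilon(|\lambda(\epsilon)| + |\lambda_k|) = \delta\epsilon|\lambda_k| + o(\epsilon)$.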